ggplot2 is a package included in the tidyverse that’s great for data visualization.

Today we’ll learn ggplot2 basics:

library(gapminder)
library(tidyverse)




Basic plot: ggplot() + geom_point()

Take some data and build a scatterplot.

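Run ?ggplot in your console to see the help docs for ggplot(). There’s a lot of info there. The main thing we learn is that ggplot() takes a data frame as its data argument and a set of aesthetic mappings, built with aes(), as its mapping argument.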

Then we’ll add + geom_point() to draw a scatterplot:
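As a minimal sketch of that first plot, assuming the same mapping used later in this lesson (gdpPercap on x, lifeExp on y):

```r
library(ggplot2)
library(gapminder)

# data = the gapminder data frame; aes() maps gdpPercap to x and lifeExp to y
p <- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_point()  # one point per row, i.e. per country-year

p  # printing the object draws the plot
```

Nothing is drawn by ggplot() alone except an empty panel with axes; the points only appear once a geom layer is added.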


Exercise

Exercise 1: Draw a scatterplot that plots year on the x-axis and lifeExp on the y-axis. Does it seem like countries have had higher life expectancies over time?




Add labels: + labs()

Next we’ll add a title and adjust the labels on the x- and y-axis.

Check out ?labs:

Good titles explain something about what your plot means, but that often leads to long titles. My title was running off the page, so I adjusted the global font size with a theme() call.

See the next section for more info on theme()!
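A sketch of what this might look like. The title and axis label wording here is hypothetical, and the font size of 10 is just one value that could keep a long title on the page:

```r
library(ggplot2)
library(gapminder)

p <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  labs(
    title = "Countries with higher GDP per capita tend to have higher life expectancy",
    x = "GDP per capita",
    y = "Life expectancy (years)"
  ) +
  # text is the parent of every text element, so this resizes all text at once
  theme(text = element_text(size = 10))

p
```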


Exercise

Exercise 2: Take the life expectancy over year plot from the answer to Exercise 1 and add a title, caption, and tag.





More theme()

What else can we do with theme()? Check out ?theme
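A few illustrative theme() adjustments, chosen here only as examples of what ?theme lists (legend placement, grid lines, and axis text):

```r
library(ggplot2)
library(gapminder)

p <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  theme(
    legend.position = "bottom",                        # move the legend under the plot
    panel.grid.minor = element_blank(),                # remove minor grid lines
    axis.text.x = element_text(angle = 45, hjust = 1)  # tilt the x-axis tick labels
  )

p
```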





Presets: theme_*
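ggplot2 ships with complete preset themes that restyle the whole plot in one call. A small sketch using two of them, theme_minimal() and theme_bw(), picked here only as examples:

```r
library(ggplot2)
library(gapminder)

base_plot <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point()

p_minimal <- base_plot + theme_minimal()  # no panel background or border
p_bw <- base_plot + theme_bw()            # white panel with a dark border

p_minimal
p_bw
```

A theme() call added after a preset still applies, so presets can be used as a starting point and then fine-tuned.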








Fit a line: + geom_smooth()

geom_smooth() draws smoothed conditional means. Here, it adds another layer of graphics on top of the scatterplot.

Any geom will inherit data and also aesthetic mappings from the ggplot call. So for cleaner looking code I can write this:
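A sketch of the two ways of writing it. The choice of method = "lm" (a straight line) is an assumption made for this example; leaving method out lets geom_smooth() pick a smoother itself.

```r
library(ggplot2)
library(gapminder)

# Verbose: every geom repeats the same aesthetic mapping
p_verbose <- ggplot(gapminder) +
  geom_point(aes(x = gdpPercap, y = lifeExp)) +
  geom_smooth(aes(x = gdpPercap, y = lifeExp), method = "lm")

# Cleaner: the mapping is given once in ggplot() and inherited by both geoms
p_inherit <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  geom_smooth(method = "lm")

p_inherit
```

Both objects draw the same picture; only the amount of repeated code differs.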





Scales: scale_x_log10()

The scatterplot is fan-shaped, which is a sign you might want to log-transform one (or both) of the axes. Here are two techniques that lead to almost the same result.

Note the difference in the breaks on the x-axis. log10(1000) = 3, but log GDP/cap = 3 is harder to decipher than GDP/cap = 1,000.
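The two techniques could be sketched like this: transforming the variable inside aes(), versus leaving the variable alone and transforming the scale with scale_x_log10().

```r
library(ggplot2)
library(gapminder)

# Technique 1: log the variable itself; axis breaks are in log10 units (3, 4, ...)
p_logvar <- ggplot(gapminder, aes(x = log10(gdpPercap), y = lifeExp)) +
  geom_point()

# Technique 2: log the scale; axis breaks stay in original units (1000, 10000, ...)
p_logscale <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  scale_x_log10()

p_logvar
p_logscale
```

The points land in the same positions in both plots; only the axis labelling differs.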





Color to represent continent

Next I want to color the points by continent. That’s another aesthetic mapping. Just like gdpPercap is mapped to x and lifeExp is mapped to y, we can map continent to color.
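A minimal sketch of that mapping, keeping the log scale on x from the previous section:

```r
library(ggplot2)
library(gapminder)

# continent is mapped to color inside aes(), alongside x and y
p <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  scale_x_log10()

p  # ggplot2 picks one color per continent and adds a legend
```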




Exercise

Exercise 3: Instead of mapping continent to color, map continent to shape. What’s the default shape scale?



Color to fixed value

Suppose instead of mapping continent to color, I want to color all the dots pink. That’s not an aesthetic mapping, because it doesn’t take information in the data and represent it with aesthetics in the plot. To implement it, write color = "pink" in the geom_point() call, outside of aes().
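A short sketch of setting a fixed color, with color passed to geom_point() directly rather than through aes():

```r
library(ggplot2)
library(gapminder)

p <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point(color = "pink") +  # fixed value: outside aes(), so no legend
  scale_x_log10()

p
```

Writing aes(color = "pink") instead would treat "pink" as a data value, so ggplot2 would assign it a default color and add a legend for it.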





Adjust color scale: scale_color_manual()

Go back to mapping continent to color. Say I don’t like this default color scale. That’s another scale I can adjust.

continent is a factor variable with 5 levels, so I’ll need to pick out 5 colors.

Go here to pick out colors by name, like "ivory3".

I prefer to just google “color picker” and use the widget that comes up to get hex codes like “#553469”.
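A sketch of a manual color scale. The five colors below are arbitrary picks for illustration (two of them are the examples mentioned above); naming the vector by continent makes the assignment explicit.

```r
library(ggplot2)
library(gapminder)

# One color per level of continent (illustrative choices, not a recommendation)
continent_colors <- c(
  Africa   = "#553469",
  Americas = "ivory3",
  Asia     = "#1b9e77",
  Europe   = "#d95f02",
  Oceania  = "#7570b3"
)

p <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  scale_x_log10() +
  scale_color_manual(values = continent_colors)

p
```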



Exercise

Exercise 4: Instead of using aes(color = continent) and adjusting the color scale, use aes(color = continent, shape = continent) and adjust the shape scale along with the color scale. Try scale_shape_manual().





Adjust transparency: alpha

Whenever points overlap a lot like this, it’s a good idea to try adjusting the transparency of the points. We can do that by setting alpha. alpha must be a number between 0 and 1. The default is 1, and the closer it is to 0, the more transparent the points are.
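A sketch with alpha set to 0.3, a value chosen here just for illustration:

```r
library(ggplot2)
library(gapminder)

p <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point(alpha = 0.3) +  # fixed transparency for every point, so outside aes()
  scale_x_log10()

p  # dense regions now look darker than sparse ones
```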





Point size

Now I want to adjust the size of the points. Let’s make all the points larger, then smaller. To affect all points, I’ll put size outside of the aes() call.
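A sketch of both versions. The sizes 3 and 0.5 are arbitrary example values on either side of the default point size:

```r
library(ggplot2)
library(gapminder)

base_plot <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, color = continent)) +
  scale_x_log10()

p_large <- base_plot + geom_point(size = 3)    # every point larger
p_small <- base_plot + geom_point(size = 0.5)  # every point smaller

p_large
p_small
```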



Map pop to size

I can also map population to size, so big countries get big points and small countries get small points. To do that, I’ll put size = pop in the aes() call!
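A minimal sketch of mapping pop to size (alpha = 0.5 is added here only to keep the large points from hiding the small ones):

```r
library(ggplot2)
library(gapminder)

# size = pop is inside aes(), so point size now depends on the data
p <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, color = continent, size = pop)) +
  geom_point(alpha = 0.5) +
  scale_x_log10()

p  # a size legend is added automatically
```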




Faceting: facet_wrap()

We’re nearly done for today! One of the last things we’ll talk about is faceting. Notice we have all the years of data mashed into one plot here? Suppose I wanted to draw a different plot for each year in the dataset. There’s a way to quickly do that, and it’s called faceting.
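A sketch of faceting by year. facet_wrap(~ year) draws one panel per distinct value of year, and gapminder has 12 of them.

```r
library(ggplot2)
library(gapminder)

p <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point(alpha = 0.5) +
  scale_x_log10() +
  facet_wrap(~ year)  # one small plot per year, sharing the same axes

p
```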



Exercise

Exercise 5: Use facet_wrap() to facet by continent instead of year. If you wanted to see growth in GDP/capita and life expectancy over time, how would you visualize it here?



Animation: gganimate::transition_time()

Finally, instead of breaking the data out into many plots, we can overlay the plots and create an animation! I use gganimate::transition_time() here, and I also decided to replace geom_point() with geom_text().
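A rough sketch of such an animation, assuming the gganimate package is installed. Using country as the text label and showing the year in the title are choices made for this example.

```r
library(ggplot2)
library(gapminder)
library(gganimate)

anim <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_text(aes(label = country), show.legend = FALSE) +  # country names instead of dots
  scale_x_log10() +
  labs(title = "Year: {frame_time}") +  # gganimate fills in the current time
  transition_time(year)                 # move smoothly from one year to the next

# Rendering can take a while and needs a renderer (for example the gifski package):
# animate(anim)
```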




Review

We’ve covered a lot of ground! Here are the things we’ve learned:



Resources




Assignment 3: get to know more geoms


3.3 geom_abline(), geom_vline(), and geom_hline()

You can use these three geoms to add straight lines to your plot. Take the histogram you drew in 3.2 and add a vertical line with geom_vline() at the international poverty line, currently set at $1.90 per day ($693.50 per year).
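The histogram from 3.2 is not shown here, so the sketch below assumes it was a histogram of gdpPercap on a log scale; only the geom_vline() layer is the point of the example.

```r
library(ggplot2)
library(gapminder)

poverty_line <- 693.50  # $1.90 per day * 365 days

p <- ggplot(gapminder, aes(x = gdpPercap)) +
  geom_histogram(bins = 30) +             # assumed stand-in for the 3.2 histogram
  scale_x_log10() +
  geom_vline(xintercept = poverty_line,   # vertical line at the poverty line
             linetype = "dashed", color = "red")

p
```

geom_hline() works the same way with yintercept, and geom_abline() takes slope and intercept.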